Localization and a Distributed Local Optimal Solution Algorithm for a Class of Multi-Agent Markov Decision Processes

Author

  • Hyeong Soo Chang
Abstract

We consider discrete-time factorial Markov decision processes (MDPs) in a multi-agent environment under the infinite-horizon average-reward criterion, with a general joint reward structure but a factorial joint state-transition structure. We introduce the concept of “localization”: the global MDP is localized for each agent, so that each agent needs to consider only a local MDP defined over its own state and action spaces. Based on this, we present a gradient-ascent-like iterative distributed algorithm that converges to a locally optimal solution of the global MDP. The solution is an autonomous joint policy, in that each agent’s decision depends only on its local state.
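The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea and not the author's method. It assumes two agents, ergodic local chains, and small state and action spaces, and it swaps the gradient-ascent-like update for a plain agent-by-agent best-response loop solved by brute force. All function names, the data layout (`P[i][a][s]` for agent i's local transitions, `R[s1][s2][a1][a2]` for the joint reward), and the `random_problem` helper are hypothetical.

```python
import itertools
import random

def stationary(P, iters=500):
    """Stationary distribution by power iteration (assumes an ergodic chain)."""
    n = len(P)
    d = [1.0 / n] * n
    for _ in range(iters):
        d = [sum(d[s] * P[s][t] for s in range(n)) for t in range(n)]
    return d

def local_chain(P_i, pi_i):
    """Agent i's local transition matrix under its local policy pi_i."""
    return [P_i[pi_i[s]][s] for s in range(len(pi_i))]

def localized_reward(P, R, pi, i):
    """Reward of agent i's local MDP: the joint reward averaged over the
    other agent's stationary distribution, with that agent's policy frozen."""
    j = 1 - i
    d_j = stationary(local_chain(P[j], pi[j]))
    def joint(s_i, a_i, s_j):
        a_j = pi[j][s_j]
        return R[s_i][s_j][a_i][a_j] if i == 0 else R[s_j][s_i][a_j][a_i]
    n_a, n_s = len(P[i]), len(P[i][0])
    return [[sum(d_j[t] * joint(s, a, t) for t in range(len(d_j)))
             for a in range(n_a)] for s in range(n_s)]

def local_value(P_i, r, pi_i):
    """Average reward of a local policy in the local MDP (P_i, r)."""
    d = stationary(local_chain(P_i, pi_i))
    return sum(d[s] * r[s][pi_i[s]] for s in range(len(d)))

def solve_local(P_i, r, current):
    """Brute-force search over deterministic local policies."""
    best, best_val = current, local_value(P_i, r, current)
    for cand in itertools.product(range(len(P_i)), repeat=len(current)):
        val = local_value(P_i, r, cand)
        if val > best_val + 1e-12:
            best, best_val = cand, val
    return best

def coordinate_ascent(P, R, pi, max_rounds=100):
    """Agents take turns solving their local MDPs until no one changes."""
    pi = list(pi)
    for _ in range(max_rounds):
        changed = False
        for i in range(2):
            new_pi = solve_local(P[i], localized_reward(P, R, pi, i), pi[i])
            if new_pi != pi[i]:
                pi[i], changed = new_pi, True
        if not changed:
            break
    return tuple(pi)

def global_value(P, R, pi):
    """Global average reward of the autonomous joint policy pi."""
    return local_value(P[0], localized_reward(P, R, pi, 0), pi[0])

def random_problem(seed, n_s=3, n_a=2):
    """Hypothetical test instance with strictly positive local transitions."""
    rng = random.Random(seed)
    def dist():
        w = [rng.uniform(0.1, 1.0) for _ in range(n_s)]
        return [x / sum(w) for x in w]
    P = [[[dist() for _ in range(n_s)] for _ in range(n_a)] for _ in range(2)]
    R = [[[[rng.random() for _ in range(n_a)] for _ in range(n_a)]
          for _ in range(n_s)] for _ in range(n_s)]
    return P, R

P, R = random_problem(0)
start = ((0, 0, 0), (0, 0, 0))
policy = coordinate_ascent(P, R, start)
```

Because the transitions factor across agents and each policy reads only its own local state, the joint stationary distribution is the product of the local ones. Each local solve therefore cannot lower the global average reward, and the loop stops at a joint policy that no single agent can improve on its own, which is a local (not necessarily global) optimum.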

Similar resources

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...

A Distributed Algorithm for Solving a Class of Multi-agent Markov Decision Problems

We consider a class of infinite horizon Markov decision processes (MDPs) with multiple decision makers, called agents, and a general joint reward structure, but a special decomposable state/action structure such that each individual agent’s actions affect the system’s state transitions independently from the actions of all other agents. We introduce the concept of “localization,” where each age...

A Hybrid Algorithm using Firefly, Genetic, and Local Search Algorithms

In this paper, a hybrid multi-objective algorithm consisting of features of genetic and firefly algorithms is presented. The algorithm starts with a set of fireflies (particles) that are randomly distributed in the solution space; these particles converge to the optimal solution of the problem during the evolutionary stages. Then, a local search plan is presented and implemented for searching s...

Distributed Generation Expansion Planning Considering Load Growth Uncertainty: A Novel Multi-Period Stochastic Model

Distributed generation (DG) technology is known as an efficient solution for distribution system planning (DSP) problems. Load growth uncertainty in the distribution network is a significant source of uncertainty that strongly affects the optimal management of DGs. To handle this problem, a novel model is proposed in this paper based on DG solution, consideri...

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...

Publication date: 2003